Goal-aware generative adversarial imitation learning from imperfect demonstration for robotic cloth manipulation

Authors

Abstract

Generative Adversarial Imitation Learning (GAIL) can learn policies from demonstrations without explicitly defining the reward function. GAIL has the potential to learn policies with high-dimensional observations as input, e.g., images. By applying GAIL to a real robot, perhaps robot policies can be obtained for daily activities like washing, folding clothes, cooking, and cleaning. However, human demonstration data are often imperfect due to mistakes, which degrade the performance of the resulting policies. We address this issue by focusing on the following features: (1) many robotic tasks are goal-reaching tasks, and (2) labeling such goal states in human demonstration data is relatively easy. With these in mind, this paper proposes Goal-Aware Generative Adversarial Imitation Learning (GA-GAIL), which trains a policy by introducing a second discriminator to distinguish the goal state in parallel with the first discriminator that indicates the demonstration data. This extends the standard GAIL framework to more robustly learn desirable policies even from imperfect demonstrations through a goal-state discriminator that promotes achieving the goal state. Furthermore, GA-GAIL employs the Entropy-maximizing Deep P-Network (EDPN) as a generator, which considers both the smoothness and the causal entropy in the policy update, to achieve stable policy learning from two discriminators. Our proposed method was successfully applied to two real-robotic cloth-manipulation tasks: turning a handkerchief over and folding clothes. We confirmed that it learns cloth-manipulation policies without task-specific reward function design. Videos of the experiments are available at URL.
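The two-discriminator idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `combined_reward`, the `goal_weight` parameter, the convention that each discriminator outputs the probability that a sample looks like demonstration data or like a goal state, and the additive way the two signals are merged. The paper's actual reward and its EDPN update may differ.

```python
import numpy as np


def sigmoid(x):
    """Map a raw discriminator logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def combined_reward(demo_logit, goal_logit, goal_weight=1.0, eps=1e-8):
    """Hypothetical surrogate reward built from two discriminator logits.

    demo_logit: logit of the first discriminator (assumed: high means
        "looks like demonstration data").
    goal_logit: logit of the second discriminator (assumed: high means
        "looks like a labeled goal state").
    goal_weight: illustrative trade-off weight, not from the paper.
    """
    d_demo = sigmoid(np.asarray(demo_logit, dtype=float))
    d_goal = sigmoid(np.asarray(goal_logit, dtype=float))
    # -log(1 - D) grows as the discriminator becomes more convinced.
    r_demo = -np.log(1.0 - d_demo + eps)
    r_goal = -np.log(1.0 - d_goal + eps)
    # Assumed combination rule: a weighted sum of the two signals.
    return r_demo + goal_weight * r_goal


if __name__ == "__main__":
    # Same demonstration score, different goal scores.
    print(combined_reward(0.0, -3.0))
    print(combined_reward(0.0, 3.0))
```

Under these assumptions, a state the goal discriminator scores highly receives a larger reward even when the demonstration discriminator is undecided, which is one way a goal signal could offset mistakes in the demonstrations.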

Similar resources

Generative Adversarial Imitation Learning

Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a...

Multimodal Storytelling via Generative Adversarial Imitation Learning

Deriving event storylines is an effective summarization method to succinctly organize extensive information, which can significantly alleviate the pain of information overload. The critical challenge is the lack of widely recognized definition of storyline metric. Prior studies have developed various approaches based on different assumptions about users’ interests. These works can extract inter...

Multi-agent Generative Adversarial Imitation Learning

We propose a new framework for multi-agent imitation learning for general Markov games, where we build upon a generalized notion of inverse reinforcement learning. We introduce a practical multi-agent actor-critic algorithm with good empirical performance. Our method can be used to imitate complex behaviors in high-dimensional environments with multiple cooperative or competitive agents.

Learning a Visual State Representation for Generative Adversarial Imitation Learning

Imitation learning is a branch of reinforcement learning that aims to train an agent to imitate an expert’s behaviour, with no explicit reward signal or knowledge of the world. Generative Adversarial Imitation Learning (GAIL) is a recent model that performs this very well, in a data-efficient manner. However, it has only been used with low-level, low-dimensional state information, with few resu...

Context-Aware Generative Adversarial Privacy

Preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. On the one hand, context-free privacy solutions, such as differential privacy, provide strong privacy guarantees, but often lead to a significant reduction in utility. On the other hand, context-aware privacy solutions, such as information theoretic privacy, achieve...

Journal

Journal title: Robotics and Autonomous Systems

Year: 2022

ISSN: 0921-8890, 1872-793X

DOI: https://doi.org/10.1016/j.robot.2022.104264